101 research outputs found
Tiled Multiplane Images for Practical 3D Photography
The task of synthesizing novel views from a single image has useful
applications in virtual reality and mobile computing, and a number of
approaches to the problem have been proposed in recent years. A Multiplane
Image (MPI) estimates the scene as a stack of RGBA layers, and can model
complex appearance effects, anti-alias depth errors and synthesize soft edges
better than methods that use textured meshes or layered depth images. Unlike
neural radiance fields, an MPI can be efficiently rendered on graphics
hardware. However, MPIs are highly redundant and require a large number of
depth layers to achieve plausible results. Based on the observation that the
depth complexity in local image regions is lower than that over the entire
image, we split an MPI into many small, tiled regions, each with only a few
depth planes. We call this representation a Tiled Multiplane Image (TMPI). We
propose a method for generating a TMPI with adaptive depth planes for
single-view 3D photography in the wild. Our synthesized results are comparable
to state-of-the-art single-view MPI methods while having lower computational
overhead.
Comment: ICCV 202
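An MPI as described above is rendered by alpha-compositing its RGBA planes from back to front with the standard "over" operator. A minimal sketch of that generic compositing step (not the paper's tiled renderer; the array layout and function name are our own):

```python
import numpy as np

def composite_mpi(layers):
    """Composite an MPI (a stack of RGBA planes) back to front.

    layers: (D, H, W, 4) array with straight (non-premultiplied) alpha,
            index 0 = farthest plane.
    Returns the rendered (H, W, 3) image.
    """
    out = np.zeros(layers.shape[1:3] + (3,))
    for rgba in layers:  # back to front
        rgb, alpha = rgba[..., :3], rgba[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)  # "over" operator
    return out
```

A transparent near plane leaves far content untouched, while fractional alpha blends the planes; that blending is what lets an MPI synthesize soft edges and anti-alias depth errors.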
Temporally Consistent Online Depth Estimation Using Point-Based Fusion
Depth estimation is an important step in many computer vision problems such
as 3D reconstruction, novel view synthesis, and computational photography. Most
existing work focuses on depth estimation from single frames. When applied to
videos, the result lacks temporal consistency, showing flickering and swimming
artifacts. In this paper we aim to estimate temporally consistent depth maps of
video streams in an online setting. This is a difficult problem as future
frames are not available and the method must choose between enforcing
consistency and correcting errors from previous estimations. The presence of
dynamic objects further complicates the problem. We propose to address these
challenges by using a global point cloud that is dynamically updated each
frame, along with a learned fusion approach in image space. Our approach
encourages consistency while simultaneously allowing updates to handle errors
and dynamic objects. Qualitative and quantitative results show that our method
achieves state-of-the-art quality for consistent video depth estimation.
Comment: Supplementary video at
https://research.facebook.com/publications/temporally-consistent-online-depth-estimation-using-point-based-fusion
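The paper's fusion is learned; as a caricature of the trade-off it describes between enforcing consistency and correcting errors, here is a hand-crafted per-pixel rule. The confidence weighting and the `dyn_thresh` parameter are our assumptions, not the paper's method:

```python
import numpy as np

def fuse_depth(prev_depth, prev_conf, new_depth, new_conf, dyn_thresh=0.1):
    """Per-pixel fusion of a reprojected previous depth map with the
    current single-frame estimate.

    Pixels whose relative depth change exceeds dyn_thresh are treated as
    dynamic (or as previous errors) and reset to the new estimate;
    otherwise a confidence-weighted average enforces temporal consistency.
    """
    rel = np.abs(new_depth - prev_depth) / np.maximum(prev_depth, 1e-6)
    w_prev = np.where(rel > dyn_thresh, 0.0, prev_conf)
    fused = (w_prev * prev_depth + new_conf * new_depth) / (w_prev + new_conf)
    conf = w_prev + new_conf
    return fused, conf
```

Static pixels are smoothed toward consistency, while large disagreements (dynamic objects) are allowed to update immediately, which is the behavior the learned approach is trained to balance.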
Layered 3D: tomographic image synthesis for attenuation-based light field and high dynamic range displays
We develop tomographic techniques for image synthesis on displays composed of compact volumes of light-attenuating material. Such volumetric attenuators recreate a 4D light field or high-contrast 2D image when illuminated by a uniform backlight. Since arbitrary oblique views may be inconsistent with any single attenuator, iterative tomographic reconstruction minimizes the difference between the emitted and target light fields, subject to physical constraints on attenuation. As multi-layer generalizations of conventional parallax barriers, such displays are shown, both by theory and experiment, to exceed the performance of existing dual-layer architectures. For 3D display, spatial resolution, depth of field, and brightness are increased, compared to parallax barriers. For a plane at a fixed depth, our optimization also allows optimal construction of high dynamic range displays, confirming existing heuristics and providing the first extension to multiple, disjoint layers. We conclude by demonstrating the benefits and limitations of attenuation-based light field displays using an inexpensive fabrication method: separating multiple printed transparencies with acrylic sheets.
Funding: Dolby Laboratories Inc.; Samsung Electronics; Alfred P. Sloan Foundation
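The optimization can be illustrated in a toy two-layer, log-domain form: each ray's log-transmittance is the sum of the log-attenuations of the layer pixels it crosses, and squared error against a target light field is minimized subject to the physical constraint that attenuation cannot exceed 1 (log values at most 0). The paper solves the full 4D problem with iterative tomographic reconstruction; the projected-gradient solver below is our simplification:

```python
import numpy as np

def solve_layers(target_log, n_iter=500, lr=0.5):
    """target_log: (N, N) log-light-field where ray (i, j) passes through
    pixel i of the front layer and pixel j of the rear layer.

    Solves for per-layer log-attenuations a, b <= 0 minimizing
    ||(a[i] + b[j]) - target_log||^2 by projected gradient descent
    (the projection enforces the physical attenuation constraint).
    """
    n = target_log.shape[0]
    a = np.zeros(n)
    b = np.zeros(n)
    for _ in range(n_iter):
        resid = a[:, None] + b[None, :] - target_log
        a = np.minimum(a - lr * resid.mean(axis=1), 0.0)  # project to <= 0
        b = np.minimum(b - lr * resid.mean(axis=0), 0.0)
    return a, b
```

Because a single attenuator cannot reproduce arbitrary oblique views exactly, the general case leaves a nonzero residual; the solver then returns the physically feasible layers closest to the target.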
Single lens off-chip cellphone microscopy
Within the last few years, cellphone subscriptions have spread widely and now cover even the remotest parts of the planet. Adequate access to healthcare, however, is not widely available, especially in developing countries. We propose a new approach to converting cellphones into low-cost scientific devices for microscopy. Cellphone microscopes have the potential to revolutionize health-related screening and analysis for a variety of applications, including blood and water tests. Our optical system is more flexible than previously proposed mobile microscopes and allows for wide field of view panoramic imaging, the acquisition of parallax, and coded background illumination, which optically enhances the contrast of transparent and refractive specimens.
Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting
We introduce tensor displays: a family of compressive light field displays comprising all architectures employing a stack of time-multiplexed, light-attenuating layers illuminated by uniform or directional backlighting (i.e., any low-resolution light field emitter). We show that the light field emitted by an N-layer, M-frame tensor display can be represented by an Nth-order, rank-M tensor. Using this representation we introduce a unified optimization framework, based on nonnegative tensor factorization (NTF), encompassing all tensor display architectures. This framework is the first to allow joint multilayer, multiframe light field decompositions, significantly reducing artifacts observed with prior multilayer-only and multiframe-only decompositions; it is also the first optimization method for designs combining multiple layers with directional backlighting. We verify the benefits and limitations of tensor displays by constructing a prototype using modified LCD panels and a custom integral imaging backlight. Our efficient, GPU-based NTF implementation enables interactive applications. Through simulations and experiments we show that tensor displays reveal practical architectures with greater depths of field, wider fields of view, and thinner form factors, compared to prior automultiscopic displays.
Funding: United States. Defense Advanced Research Projects Agency (DARPA SCENICC program); National Science Foundation (U.S.) (NSF Grant IIS-1116452); United States. Defense Advanced Research Projects Agency (DARPA MOSAIC program); United States. Defense Advanced Research Projects Agency (DARPA Young Faculty Award); Alfred P. Sloan Foundation (Fellowship)
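The rank-M representation can be checked in its simplest case: for two layers and M time-multiplexed frames, the eye averages the frames, so the emitted light field is a sum of M outer products, i.e. a nonnegative matrix of rank at most M. A sketch of that forward model (2-layer special case; variable names are ours):

```python
import numpy as np

def emitted_light_field(front, rear):
    """Two-layer, M-frame time-multiplexed display.

    front: (M, U) transmittances of the front layer per frame.
    rear:  (M, S) transmittances of the rear layer per frame.
    A ray addressed by (u, s) passes front pixel u and rear pixel s, and
    the viewer averages over the M frames:
        L[u, s] = (1/M) * sum_m front[m, u] * rear[m, s]
    This is a rank-M nonnegative matrix, the 2-layer instance of the
    Nth-order, rank-M tensor representation.
    """
    M = front.shape[0]
    return np.einsum('mu,ms->us', front, rear) / M
```

NTF then runs this model in reverse, searching for nonnegative layer patterns whose averaged product best matches a target light field.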
Multisource Holography
Holographic displays promise several benefits including high quality 3D
imagery, accurate accommodation cues, and compact form-factors. However,
holography relies on coherent illumination which can create undesirable speckle
noise in the final image. Although smooth phase holograms can be speckle-free,
their non-uniform eyebox makes them impractical, and speckle mitigation with
partially coherent sources also reduces resolution. Averaging sequential frames
for speckle reduction requires high-speed modulators and consumes temporal
bandwidth that may be needed elsewhere in the system.
In this work, we propose multisource holography, a novel architecture that
uses an array of sources to suppress speckle in a single frame without
sacrificing resolution. By using two spatial light modulators, arranged
sequentially, each source in the array can be controlled almost independently
to create a version of the target content with different speckle. Speckle is
then suppressed when the contributions from the multiple sources are averaged
at the image plane. We introduce an algorithm to calculate multisource
holograms, analyze the design space, and demonstrate up to a 10 dB increase in
peak signal-to-noise ratio compared to an equivalent single source system.
Finally, we validate the concept with a benchtop experimental prototype by
producing both 2D images and focal stacks with natural defocus cues.
Comment: 14 pages, 9 figures, to be published in SIGGRAPH Asia 202
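The statistical principle behind averaging source contributions can be checked numerically: fully developed speckle from one coherent source has intensity contrast near 1, and summing N mutually incoherent, independent speckle patterns lowers the contrast by about 1/sqrt(N). A small Monte-Carlo sketch of those statistics (our illustration, not the paper's two-SLM architecture):

```python
import numpy as np

def speckle_contrast(n_sources, n_pixels=200_000, seed=0):
    """Simulate fully developed speckle: each source contributes a field
    with independent circular-Gaussian statistics per pixel; intensities
    from mutually incoherent sources add. Returns the speckle contrast
    std(I) / mean(I), which is ~1 for one source and falls roughly as
    1/sqrt(N) for N averaged sources."""
    rng = np.random.default_rng(seed)
    I = np.zeros(n_pixels)
    for _ in range(n_sources):
        field = rng.normal(size=n_pixels) + 1j * rng.normal(size=n_pixels)
        I += np.abs(field) ** 2  # incoherent sum of intensities
    return I.std() / I.mean()
```

This is the single-frame averaging the source array provides; the second SLM is what lets each source carry the same target content with independent speckle so the average preserves resolution.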
Content-adaptive parallax barriers: optimizing dual-layer 3D displays using low-rank light field factorization
We optimize automultiscopic displays built by stacking a pair of modified LCD panels. To date, such dual-stacked LCDs have used heuristic parallax barriers for view-dependent imagery: the front LCD shows a fixed array of slits or pinholes, independent of the multi-view content. While prior works adapt the spacing between slits or pinholes, depending on viewer position, we show both layers can also be adapted to the multi-view content, increasing brightness and refresh rate. Unlike conventional barriers, both masks are allowed to exhibit non-binary opacities. It is shown that any 4D light field emitted by a dual-stacked LCD is the tensor product of two 2D masks. Thus, any pair of 1D masks only achieves a rank-1 approximation of a 2D light field. Temporal multiplexing of masks is shown to achieve higher-rank approximations. Non-negative matrix factorization (NMF) minimizes the weighted Euclidean distance between a target light field and that emitted by the display. Simulations and experiments characterize the resulting content-adaptive parallax barriers for low-rank light field approximation.
Funding: National Science Foundation (U.S.) (grant CCF-0729126); National Research Foundation of Korea (grant 2009-352-D00232)
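The low-rank factorization at the heart of this approach can be sketched with the classic Lee-Seung multiplicative updates for the (unweighted) Euclidean NMF objective; the paper uses a weighted variant, and the iteration count and initialization here are our arbitrary choices:

```python
import numpy as np

def nmf_light_field(L, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a nonnegative light field matrix L (front-pixel x rear-pixel)
    as F @ R with F, R >= 0, where each of the `rank` components is one
    time-multiplexed mask pair. Lee-Seung multiplicative updates for the
    Euclidean objective; nonnegativity is preserved automatically because
    every update is a ratio of nonnegative terms."""
    rng = np.random.default_rng(seed)
    F = rng.random((L.shape[0], rank))
    R = rng.random((rank, L.shape[1]))
    for _ in range(n_iter):
        F *= (L @ R.T) / (F @ R @ R.T + eps)
        R *= (F.T @ L) / (F.T @ F @ R + eps)
    return F, R
```

Setting `rank=1` corresponds to a single static mask pair; larger ranks model temporal multiplexing, matching the abstract's observation that time-multiplexed masks achieve higher-rank approximations.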
Waveguide Holography: Towards True 3D Holographic Glasses
We present a novel near-eye display concept which consists of a waveguide
combiner, a spatial light modulator, and a laser light source. The proposed
system can display true 3D holographic images through a see-through
pupil-replicating waveguide combiner while also providing a large eye-box. By
modeling the coherent light interaction inside of the waveguide combiner, we
demonstrate that the output wavefront from the waveguide can be controlled by
modulating the wavefront of input light using a spatial light modulator. This
new possibility allows combining a holographic display, widely considered the
ultimate 3D display technology, with state-of-the-art pupil-replicating
waveguides, paving the way towards true 3D holographic augmented reality
glasses.
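The controllability claim can be illustrated with a toy linear model: if coherent propagation through the waveguide is represented as a complex matrix T acting on a discretized input wavefront, then finding the SLM pattern that produces a desired output wavefront is a least-squares inverse problem. T here is a stand-in for the paper's coherent waveguide model, which it does not publish in this form:

```python
import numpy as np

def control_output(T, target):
    """Toy wavefront control through a known linear coherent system.

    T:      (N, N) complex matrix modeling propagation through the
            waveguide (output field = T @ input field).
    target: (N,) desired complex output wavefront.
    Returns the least-squares input wavefront (the SLM modulation, up to
    encoding) and the output it actually produces.
    """
    x = np.linalg.pinv(T) @ target
    return x, T @ x
```

When T is well conditioned the achieved output matches the target closely, which is the sense in which modulating the input wavefront controls the output wavefront.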
BiDi screen: a thin, depth-sensing LCD for 3D interaction using light fields
We transform an LCD into a display that supports both 2D multi-touch and unencumbered 3D gestures. Our BiDirectional (BiDi) screen, capable of both image capture and display, is inspired by emerging LCDs that use embedded optical sensors to detect multiple points of contact. Our key contribution is to exploit the spatial light modulation capability of LCDs to allow lensless imaging without interfering with display functionality. We switch between a display mode showing traditional graphics and a capture mode in which the backlight is disabled and the LCD displays a pinhole array or an equivalent tiled-broadband code. A large-format image sensor is placed slightly behind the liquid crystal layer. Together, the image sensor and LCD form a mask-based light field camera, capturing an array of images equivalent to that produced by a camera array spanning the display surface. The recovered multi-view orthographic imagery is used to passively estimate the depth of scene points. Two motivating applications are described: a hybrid touch plus gesture interaction and a light-gun mode for interacting with external light-emitting widgets. We show a working prototype that simulates the image sensor with a camera and diffuser, allowing interaction up to 50 cm in front of a modified 20.1 inch LCD.
Funding: National Science Foundation (U.S.) (Grant CCF-0729126); Alfred P. Sloan Foundation
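Passive depth estimation from the recovered orthographic views reduces to measuring parallax: a feature at distance z from the screen shifts by about z * tan(dtheta) pixels between two views separated by angular baseline dtheta. A 1D sketch of that idea (our simplification of the BiDi screen's multi-view depth estimation; the brute-force shift search is ours):

```python
import numpy as np

def depth_from_shift(view_a, view_b, dtheta, shifts=range(0, 8)):
    """Estimate depth from two 1D orthographic views separated by the
    angular baseline dtheta. The integer shift best aligning view_b to
    view_a gives depth z = shift / tan(dtheta)."""
    errs = [np.sum((np.roll(view_b, -s) - view_a) ** 2) for s in shifts]
    best = list(shifts)[int(np.argmin(errs))]
    return best / np.tan(dtheta)
```

Real systems match 2D patches across many views and handle sub-pixel shifts, but the geometry is the same: larger parallax means a point farther from the screen.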
Perceptual Requirements for World-Locked Rendering in AR and VR
Stereoscopic, head-tracked display systems can show users realistic,
world-locked virtual objects and environments. However, discrepancies between
the rendering pipeline and physical viewing conditions can lead to perceived
instability in the rendered content resulting in reduced realism, immersion,
and, potentially, visually-induced motion sickness. The requirements to achieve
perceptually stable world-locked rendering are unknown due to the challenge of
constructing a wide field of view, distortion-free display with highly accurate
head- and eye-tracking. In this work we introduce new hardware and software
built upon recently introduced hardware and present a system capable of
rendering virtual objects over real-world references without perceivable drift
under such constraints. The platform is used to study acceptable errors in
render camera position for world-locked rendering in augmented and virtual
reality scenarios, where we find an order of magnitude difference in perceptual
sensitivity between them. We conclude by comparing study results with an
analytic model which examines changes to apparent depth and visual heading in
response to camera displacement errors. We identify visual heading as an
important consideration for world-locked rendering alongside depth errors from
incorrect disparity.
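A toy version of the geometric quantity such an analytic model examines: a lateral render-camera displacement dx rotates the apparent direction of a point at distance z by atan(dx / z), so visual heading errors shrink with viewing distance. This small-angle sketch is our own illustration, not the paper's model:

```python
import math

def heading_error_deg(dx, z):
    """Visual heading error, in degrees, induced by a lateral
    render-camera displacement dx (meters) for a world-locked point at
    distance z (meters): the rendered ray rotates by atan(dx / z)."""
    return math.degrees(math.atan2(dx, z))
```

For example, a 1 cm camera-position error for content at 1 m produces roughly half a degree of heading error, while the same error at 2 m produces half as much, consistent with nearby world-locked content being the most demanding case.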
- …